Abstract

This article concerns tests for location parameters in cases where the data dimension is larger than the sample size. We propose a family of tests based on the optimality arguments in Le Cam (1986) under elliptical symmetry. The asymptotic normality of these tests is established. By maximizing the asymptotic power function, we propose a uniformly optimal test for all elliptically symmetric distributions. The optimality is also confirmed by a Monte Carlo investigation.
1 Introduction

Testing the population mean vector is a fundamental problem in statistics. A classical method to deal with this problem is the famous Hotelling's test. However, it cannot work in high dimensional settings because the sample covariance matrix is not invertible. With the rapid development of technology, various types of high-dimensional data have been generated in many areas, such as internet portals and microarray analysis. By replacing the Mahalanobis distance with the Euclidean distance, many modified Hotelling's tests for high dimensional data have been proposed in the literature, such as Chen and Qin (2010), Srivastava (2009) and Feng et al. (2015b). However, the moment-based tests mentioned above would be degraded when the non-normality is severe, especially for heavy-tailed distributions.

Many nonparametric methods have been developed, as a reaction to the Gaussian approach of Hotelling's test, with the objective of extending to the multivariate context the classical univariate rank and signed-rank techniques. There are three main groups. The first relies on componentwise rankings (Puri and Sen, 1971), but is not affine invariant. The second group is based on spatial signs and ranks with the so-called Oja median (Oja, 2010). Some efforts have been devoted to extending this type of method to high dimensional data. Wang et al. (2015) propose a high dimensional spatial sign test by replacing the scatter matrix with the identity matrix. Feng et al. (2015a) also propose a scalar-invariant high dimensional sign test for the two sample location problem. They demonstrate that the multivariate sign and rank are still very efficient methods for constructing robust tests in high dimension settings. The last group uses the concept of interdirections (Randles, 1992). In an important work, Hallin and Paindaveine (2002) propose a class of tests based on interdirections and pseudo-Mahalanobis ranks. Depending on the score function considered, they allow for locally asymptotically maximin tests at selected densities. However, to the best of our knowledge, there are no optimal tests for high dimensional location parameters.

In this article, we propose a uniformly optimal test for high dimensional data. Based on the optimality arguments in Le Cam (1986), we introduce a high dimensional form of the locally and asymptotically optimal testing procedure. The asymptotic normality of this class of tests is established. By maximizing the power function of these tests, we propose a uniformly optimal test for the high dimensional location problem. In the multivariate case, the optimal score function depends heavily on the underlying distribution. However, the optimal weight function for our high dimensional test is unique. So our proposed test procedure is uniformly optimal for the elliptically symmetric distributions. We also derive the asymptotic relative efficiency of our test with respect to Chen and Qin (2010)'s test and Wang et al. (2015)'s test. It is not surprising that both are no less than one for the elliptically symmetric distributions. For heavy-tailed distributions, such as multivariate t-distributions or mixtures of multivariate normal distributions, our test would eventually perform better than these two tests. Simulation studies also demonstrate these results.


2 Uniformly Optimal Test 


2.1 High Dimensional Weighted Sign tests 

Assume $\{X_i\}_{i=1}^n$ is an i.i.d. random sample from a $p$-variate elliptically symmetric distribution with density function $\det(\Sigma)^{-1/2} g(\|\Sigma^{-1/2}(x - \theta)\|)$, where $\theta$ is the symmetry center and $\Sigma$ is the positive definite symmetric $p \times p$ scatter matrix. We consider the following one sample testing problem

$$H_0 : \theta = 0 \quad \text{versus} \quad H_1 : \theta \neq 0. \qquad (1)$$


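When the dimension $p$ is fixed, according to the local asymptotic normality theory (Le Cam, 1986), the form of locally and asymptotically optimal testing procedures for (1) under specified $\Sigma$ and $g$ is

$$Q_n = \frac{1}{c_{p,g}} \sum_{i=1}^{n}\sum_{j=1}^{n} \psi_g(\|\Sigma^{-1/2}X_i\|)\,\psi_g(\|\Sigma^{-1/2}X_j\|)\,U(\Sigma^{-1/2}X_i)^T U(\Sigma^{-1/2}X_j),$$

where $U(x) = x/\|x\|\, I(x \neq 0)$, $\psi_g = -g'/g$ and $c_{p,g}$ is a scale parameter.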


Hallin and Paindaveine (2002) proposed a class of tests based on interdirections and pseudo-Mahalanobis ranks which are of the asymptotic form

$$R_n = \frac{2}{n(n-1)} \sum\sum_{i<j} K(\|\Sigma^{-1/2}X_i\|)\,K(\|\Sigma^{-1/2}X_j\|)\,U(\Sigma^{-1/2}X_i)^T U(\Sigma^{-1/2}X_j),$$

where $K(\cdot)$ is a continuous weight function. However, the scatter matrix $\Sigma$ is not available in high dimensional settings. Motivated by Bai and Saranadasa (1996) and Chen and Qin (2010), we simply replace $\Sigma$ by $I_p$ in $R_n$ and exclude the terms with identical indices ($i = j$). We propose the following generally weighted sign test statistic:

$$W_n = \frac{2}{n(n-1)} \sum\sum_{i<j} K(r_i)K(r_j)\,U(X_i)^T U(X_j),$$

where $r_i = \|X_i\|$. With $K(t) = t$, $W_n$ is the one-sample high dimensional t-test statistic proposed in Chen and Qin (2010). Similarly, we obtain the high dimensional sign test (Wang et al., 2015) with $K(t) = 1$. We will determine the optimal weight function $K(t)$ in the next section. First, we give an asymptotic analysis of $W_n$.
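The definition of $W_n$ above can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function names `spatial_sign` and `weighted_sign_statistic` are hypothetical and do not come from the paper, and the weight $K$ is assumed to be a vectorized Python callable. The double sum over $i<j$ is computed through the identity $\sum_{i \neq j} V_i^T V_j = \|\sum_i V_i\|^2 - \sum_i \|V_i\|^2$ with $V_i = K(r_i)U(X_i)$.

```python
import numpy as np

def spatial_sign(X):
    """Row-wise spatial sign U(x) = x / ||x||, with U(0) = 0."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    safe = np.where(norms > 0, norms, 1.0)  # avoid division by zero
    return np.where(norms > 0, X / safe, 0.0)

def weighted_sign_statistic(X, K):
    """W_n = 2 / (n (n - 1)) * sum_{i<j} K(r_i) K(r_j) U(X_i)^T U(X_j).

    X is an (n, p) data matrix and K a vectorized weight function
    (hypothetical interface, chosen for this sketch).
    """
    n = X.shape[0]
    r = np.linalg.norm(X, axis=1)            # r_i = ||X_i||
    V = K(r)[:, None] * spatial_sign(X)      # rows V_i = K(r_i) U(X_i)
    s = V.sum(axis=0)
    # sum over i != j of V_i^T V_j, which is twice the sum over i < j
    off_diag = s @ s - np.einsum("ij,ij->", V, V)
    return off_diag / (n * (n - 1))

# The three weights discussed in the text
K_cq = lambda t: t                  # Chen and Qin (2010) type statistic
K_ss = lambda t: np.ones_like(t)    # spatial sign statistic of Wang et al. (2015)
K_os = lambda t: 1.0 / t            # inverse-norm weight (requires r_i > 0)
```

With `K_cq` the weighted signs reduce to the raw observations, so the statistic coincides with $2\{n(n-1)\}^{-1}\sum\sum_{i<j} X_i^T X_j$ up to floating point error.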


Recently, many high dimensional scalar-invariant tests have appeared in the literature (Park and Ayyala, 2013; Srivastava, 2009; Feng et al., 2015a,b). The idea is to replace $\Sigma$ by its diagonal matrix, so that all the variables have the same scale. Here we also standardize each variable first by the estimated diagonal matrix in Feng et al. (2015a), which makes $W_n$ invariant under scale transformations. Details about the scalar-invariant test are given in the appendix. To expedite our discussion, we assume that the diagonal elements of $\Sigma$ are known and equal to one without loss of generality.

The following conditions are needed. 

(C1) $\mathrm{tr}(\Sigma^4) = o(\mathrm{tr}^2(\Sigma^2))$ and $\mathrm{tr}(\Sigma^2) - p = o(n^{-1}p^2)$.

(C2) $\nu_4 = O(\nu_2^2)$ where $\nu_l = E(K^l(r_i))$.

The first condition in (C1) is similar to condition (3.8) in Chen and Qin (2010). Obviously, (C1) will hold if all the eigenvalues of $\Sigma$ are bounded. The second condition in (C1) is used to reduce the difference between the modules $\|\varepsilon\|$ and $\|X\|$. Then, we can get an explicit relationship between the variance of $W_n$ and $\Sigma$. Condition (C2) is similar to Assumption 1 in Zou et al. (2014) for a suitable choice of $K(t)$.


Theorem 1 Under Conditions (C1)-(C2) and $H_0$, as $(p,n) \to \infty$,

$$W_n/\sigma_n \xrightarrow{d} N(0,1),$$

where $\sigma_n^2 = 2n^{-2}p^{-2}\nu_2^2\,\mathrm{tr}(\Sigma^2)$.


Similar to Wang et al. (2015), we propose the following ratio-consistent estimator of $\sigma_n^2$:

[...]

where [...] $\sum_{k \neq i,j} U(X_k)$. We then reject the null hypothesis if $W_n/\hat\sigma_n > z_\alpha$, where $z_\alpha$ is the upper $\alpha$ quantile of $N(0,1)$.

Next, we consider the asymptotic distribution of $W_n$ under the alternative hypothesis.

(C3) $\theta^T\theta = O(c_0^{-2}\sigma_n)$, $\theta^T\Sigma\theta = o(npc_0^{-2}\sigma_n)$ where $c_0 = E\{K(r_i)r_i^{-1}\}$.

Condition (C3) requires that the difference between $\theta$ and $0$ is not large, so that the variance of $W_n$ is still asymptotically $\sigma_n^2$. It can be viewed as a high-dimensional version of the local alternative hypotheses.

Theorem 2 Under Conditions (C1)-(C3), as $(p,n) \to \infty$, we have

$$\frac{W_n - c_0^2\,\theta^T\theta}{\sigma_n} \xrightarrow{d} N(0,1).$$


2.2 High Dimensional Optimal Sign test 


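According to Theorems 1 and 2, the asymptotic power of our weighted sign test becomes

$$\beta_{WS}(\|\theta\|) = \Phi\left(-z_\alpha + \frac{[E\{K(r_i)r_i^{-1}\}]^2\, pn\,\theta^T\theta}{E\{K^2(r_i)\}\sqrt{2\,\mathrm{tr}(\Sigma^2)}}\right).$$

The power function of $W_n$ is an increasing function of $[E\{K(r_i)r_i^{-1}\}]^2 / E\{K^2(r_i)\}$. By the Cauchy inequality, we have

$$\frac{[E\{K(r_i)r_i^{-1}\}]^2}{E\{K^2(r_i)\}} \le \frac{E\{K^2(r_i)\}\,E(r_i^{-2})}{E\{K^2(r_i)\}} = E(r_i^{-2}).$$

The maximum of $\beta_{WS}(\|\theta\|)$ is $\Phi\left(-z_\alpha + E(r_i^{-2})\, pn\,\theta^T\theta / \sqrt{2\,\mathrm{tr}(\Sigma^2)}\right)$, with maximizer $K(t) = t^{-1}$. Consequently, we propose the following high dimensional optimal sign test

$$T_n = \frac{2}{n(n-1)} \sum\sum_{i<j} r_i^{-1} r_j^{-1}\, U(X_i)^T U(X_j).$$

By Conditions (C1) and (C3), $E(r_i^{-2}) = E(\|\varepsilon_i\|^{-2})(1 + o(1))$, where $\varepsilon_i = \Sigma^{-1/2}(X_i - \theta)$. So the power function of $T_n$ is

$$\beta_{OS}(\|\theta\|) = \Phi\left(-z_\alpha + \frac{E(\|\varepsilon_i\|^{-2})\, pn\,\theta^T\theta}{\sqrt{2\,\mathrm{tr}(\Sigma^2)}}\right).$$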
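The Cauchy inequality step can be checked numerically. The sketch below is an illustration under assumptions of my own (radii simulated from a multivariate $t$ distribution with $p = 100$ and 3 degrees of freedom, and a hypothetical helper `power_ratio`); it evaluates the sample analogue of $[E\{K(r)r^{-1}\}]^2/E\{K^2(r)\}$ for the three weights discussed in the text and compares it with the bound $E(r^{-2})$.

```python
import numpy as np

def power_ratio(r, K):
    """Sample analogue of [E{K(r) r^{-1}}]^2 / E{K^2(r)}."""
    k = K(r)
    return np.mean(k / r) ** 2 / np.mean(k ** 2)

rng = np.random.default_rng(1)
p, nu, n_draws = 100, 3, 5000

# Radii of draws from t_p(0, I_p, nu): ||z|| / sqrt(chi2_nu / nu)
z = rng.standard_normal((n_draws, p))
r = np.linalg.norm(z, axis=1) / np.sqrt(rng.chisquare(nu, size=n_draws) / nu)

ratios = {
    "K(t)=t (CQ)": power_ratio(r, lambda t: t),
    "K(t)=1 (SS)": power_ratio(r, lambda t: np.ones_like(t)),
    "K(t)=1/t (OS)": power_ratio(r, lambda t: 1.0 / t),
}
bound = np.mean(r ** -2.0)  # sample analogue of E(r^{-2})

for name, value in ratios.items():
    print(f"{name:14s} ratio / bound = {value / bound:.3f}")
```

Because the Cauchy inequality also holds for the empirical distribution of the radii, every printed ratio is at most one, and the inverse-norm weight attains the bound exactly.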
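Chen and Qin (2010) and Wang et al. (2015) show that the asymptotic powers of their proposed tests are

$$\beta_{CQ}(\|\theta\|) = \Phi\left(-z_\alpha + \frac{np\,\theta^T\theta}{E(\|\varepsilon\|^2)\sqrt{2\,\mathrm{tr}(\Sigma^2)}}\right),$$

$$\beta_{SS}(\|\theta\|) = \Phi\left(-z_\alpha + \frac{\{E(\|\varepsilon\|^{-1})\}^2\, np\,\theta^T\theta}{\sqrt{2\,\mathrm{tr}(\Sigma^2)}}\right).$$

Thus, the asymptotic relative efficiencies of our proposed test with respect to these two tests are

$$\mathrm{ARE}(OS, CQ) = E(\|\varepsilon\|^{-2})\,E(\|\varepsilon\|^2) \ge 1,$$

$$\mathrm{ARE}(OS, SS) = \frac{E(\|\varepsilon\|^{-2})}{\{E(\|\varepsilon\|^{-1})\}^2} = 1 + \frac{\mathrm{var}(\|\varepsilon\|^{-1})}{\{E(\|\varepsilon\|^{-1})\}^2} \ge 1.$$

Equality holds in both expressions only when $\|\varepsilon\|/E(\|\varepsilon\|) \xrightarrow{p} 1$. If $\|\varepsilon\|/E(\|\varepsilon\|) \xrightarrow{p} 1$, these three tests are asymptotically equivalent. Otherwise, our proposed test would perform better than the other two tests.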

When $\varepsilon_i \sim N(0, I_p)$, $\|\varepsilon_i\|/\sqrt{p} \xrightarrow{p} 1$. Then, ARE(OS, CQ) and ARE(OS, SS) are both equal to one.

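When $\varepsilon_i \sim t_p(0, I_p, \nu)$, where $t_p(0, I_p, \nu)$ is the standard $p$-dimensional multivariate $t$ distribution with $\nu$ degrees of freedom, we have

$$\mathrm{ARE}(OS, CQ) = \frac{\nu}{\nu - 2}, \qquad \mathrm{ARE}(OS, SS) = \frac{\nu\,\Gamma^2(\nu/2)}{2\,\Gamma^2((\nu+1)/2)}.$$

In this case, $\psi_g(t) = (p + \nu)t/(\nu + t^2) \approx p\,t^{-1}$ as $t \to \infty$. So, our uniformly optimal weight function $K(t)$ would be consistent with the "optimal" weight function $\psi_g(t)$.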

When $\varepsilon_i$ is from the mixture of two multivariate normal distributions $MN(\kappa, \sigma, I_p)$ with density function $(1 - \kappa)f_p(0, I_p) + \kappa f_p(0, \sigma^2 I_p)$, where $f_p(\cdot;\cdot)$ is the density function of the $p$-variate multivariate normal distribution, we have

$$\mathrm{ARE}(OS, CQ) = (1 - \kappa + \kappa/\sigma^2)(1 - \kappa + \kappa\sigma^2), \qquad \mathrm{ARE}(OS, SS) = \frac{1 - \kappa + \kappa/\sigma^2}{(1 - \kappa + \kappa/\sigma)^2}.$$

As $\sigma^2 \to \infty$, ARE(OS, CQ) will be arbitrarily large and ARE(OS, SS) will converge to $1/(1 - \kappa)$. However, in this case, $\psi_g(t) \propto t$ as $t \to \infty$, which is consistent with Chen and Qin (2010)'s test. So, $K(t) = \psi_g(t)$ would not be optimal in such a case. Thus, for high dimensional data, a simple extension of $Q_n$ with $\psi_g(t)$ may not always be the best test.

Table 1 reports the asymptotic relative efficiencies between these three tests under the multivariate t-distributions with different degrees of freedom and under mixture normal distributions. Formulas of the asymptotic relative efficiency for these two families of distributions are given in the Supplementary Material.


Table 1: Asymptotic relative efficiencies with different distributions.

Distribution         ARE(SS,CQ)   ARE(OS,CQ)   ARE(OS,SS)
t_p(0, I_p, 3)           2.54         3.00         1.18
t_p(0, I_p, 4)           1.76         2.00         1.13
t_p(0, I_p, 5)           1.51         1.67         1.11
t_p(0, I_p, 6)           1.38         1.50         1.09
N(0, I_p)                1.00         1.00         1.00
MN(0.2, 3, I_p)          2.06         2.25         1.09
MN(0.2, 10, I_p)        13.98        16.68         1.19
MN(0.8, 10, I_p)         6.28        16.68         2.65

$t_p(0, \Lambda, \nu)$, $p$-dimensional multivariate $t$ distribution with $\nu$ degrees of freedom and scatter matrix $\Lambda$; $MN(\kappa, \sigma, \Lambda)$, mixture multivariate normal distribution with density function $(1 - \kappa)f_p(0, \Lambda) + \kappa f_p(0, \sigma^2\Lambda)$, where $f_p(\cdot;\cdot)$ is the density function of the $p$-variate multivariate normal distribution.
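The efficiency expressions $\mathrm{ARE}(OS,CQ) = E(\|\varepsilon\|^{-2})E(\|\varepsilon\|^2)$ and $\mathrm{ARE}(OS,SS) = E(\|\varepsilon\|^{-2})/\{E(\|\varepsilon\|^{-1})\}^2$ can be evaluated with a short script. The sketch below rests on a large-$p$ approximation that is my own assumption, not a formula taken from the Supplementary Material: $\|\varepsilon\|^2/p$ is replaced by its mixing variable, $\nu/\chi^2_\nu$ for the multivariate $t$ and a two-point variable for the normal mixture. The function names are hypothetical.

```python
import math

def _ares(m2, m_inv1, m_inv2):
    """Combine scaled moments E||e||^2/p, sqrt(p) E||e||^-1 and p E||e||^-2."""
    return {
        "SS,CQ": m_inv1 ** 2 * m2,
        "OS,CQ": m_inv2 * m2,
        "OS,SS": m_inv2 / m_inv1 ** 2,
    }

def are_t(nu):
    """Large-p efficiencies for t_p(0, I_p, nu); requires nu > 2."""
    m2 = nu / (nu - 2.0)                     # E(nu / chi2_nu)
    m_inv2 = 1.0                             # E(chi2_nu / nu)
    # E sqrt(chi2_nu / nu) = sqrt(2 / nu) * Gamma((nu + 1) / 2) / Gamma(nu / 2)
    m_inv1 = math.sqrt(2.0 / nu) * math.gamma((nu + 1) / 2) / math.gamma(nu / 2)
    return _ares(m2, m_inv1, m_inv2)

def are_mixture(kappa, sigma):
    """Large-p efficiencies for the normal mixture MN(kappa, sigma, I_p)."""
    m2 = 1 - kappa + kappa * sigma ** 2
    m_inv1 = 1 - kappa + kappa / sigma
    m_inv2 = 1 - kappa + kappa / sigma ** 2
    return _ares(m2, m_inv1, m_inv2)

print({k: round(v, 2) for k, v in are_t(3).items()})
print({k: round(v, 2) for k, v in are_mixture(0.2, 10).items()})
```

Under this approximation, `are_t(3)` gives about 3.00 for ARE(OS,CQ) and 1.18 for ARE(OS,SS), and `are_mixture(0.2, 10)` gives about 16.68 and 1.19, in line with the corresponding entries of Table 1.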


3 Simulation 

Here we report a simulation study designed to evaluate the performance of the proposed test. All the simulation results are based on 2,500 replications. We consider the following five elliptical distributions: (I) $N(0, \Sigma)$; (II) $t_p(0, \Sigma, 3)$; (III) $t_p(0, \Sigma, 4)$; (IV) $MN(0.2, 10, \Sigma)$; (V) $MN(0.8, 10, \Sigma)$; and two independent component models $X_i = \Sigma^{1/2}Z_i$, $Z_i = (Z_{i1}, \cdots, Z_{ip})$, where (VI) $Z_{ij} \sim t_3$; (VII) $Z_{ij} \sim 0.8N(0,1) + 0.2N(0,100)$. The scatter matrix is $\Sigma = (0.5^{|i-j|})$. The sample size is $n = 40$ and the dimension is $p = 200, 400, 800$. Under the alternative hypothesis, two patterns of allocation are considered: (Dense case) the first 50% of the components of $\theta$ are zeros; (Sparse case) the first 95% of the components of $\theta$ are zeros. We fix $\theta^T\theta/\sqrt{\mathrm{tr}(\Sigma^2)} = 0.1$ for scenarios (I)-(IV) and (VI), and $\theta^T\theta/\sqrt{\mathrm{tr}(\Sigma^2)} = 1$ for scenarios (V) and (VII). We compare our proposed test with Chen and Qin (2010)'s test and Wang et al. (2015)'s test. Table 2 reports the empirical sizes and power of these three tests. All these tests control the empirical sizes very well. For the multivariate normal distribution and the independent component models, the differences between these three tests are negligible. This is not surprising because $\|\varepsilon\|/\sqrt{p} \xrightarrow{p} 1$ in this case. For the other non-normal cases, both Wang et al. (2015)'s test and our proposed test perform better than Chen and Qin (2010)'s test in all settings; that is, the sign-based tests perform better than the moment-based tests. Furthermore, our proposed test is more powerful than Wang et al. (2015)'s test in these cases, which is consistent with the asymptotic analysis. Though Wang et al. (2015)'s test is a very powerful method, it loses all the information in the modules of the observations. All these results suggest that our proposed test is very efficient and robust in a wide range of distributions.
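The elliptical scenarios (I)-(V) can be simulated along the following lines. This is a minimal sketch rather than the code used for Table 2: the helper names `ar1_scatter` and `sample_elliptical` are hypothetical, data are drawn under the null hypothesis ($\theta = 0$), and the independent component models (VI)-(VII) are not covered. Each observation is generated as a random scale times $\Sigma^{1/2}$ applied to a standard normal vector.

```python
import numpy as np

def ar1_scatter(p, rho=0.5):
    """Scatter matrix Sigma = (rho^{|i-j|})."""
    idx = np.arange(p)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def sample_elliptical(n, Sigma, scenario, rng):
    """Draw n observations (rows) with location 0 for scenarios (I)-(V)."""
    p = Sigma.shape[0]
    L = np.linalg.cholesky(Sigma)      # L @ L.T = Sigma
    Z = rng.standard_normal((n, p))
    if scenario == "I":                # N(0, Sigma)
        scale = np.ones(n)
    elif scenario in ("II", "III"):    # t_p(0, Sigma, nu), nu = 3 or 4
        nu = 3 if scenario == "II" else 4
        scale = 1.0 / np.sqrt(rng.chisquare(nu, size=n) / nu)
    elif scenario in ("IV", "V"):      # MN(kappa, 10, Sigma), kappa = 0.2 or 0.8
        kappa = 0.2 if scenario == "IV" else 0.8
        scale = np.where(rng.random(n) < kappa, 10.0, 1.0)
    else:
        raise ValueError(f"unknown scenario {scenario!r}")
    return (scale[:, None] * Z) @ L.T

rng = np.random.default_rng(2024)
X = sample_elliptical(40, ar1_scatter(200), "II", rng)
print(X.shape)
```

A shift $\theta$ can then be added to every row to study power; how the nonzero components of $\theta$ are chosen is not specified in the text, so it is left out of the sketch.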

4 Discussion 


In this paper, we propose a weighted sign test and determine the "optimal" weight function by maximizing the power function. Our asymptotic and numerical results together suggest that the proposed optimal sign test is quite robust and efficient in testing the population mean vector. This article concerns the one sample location problem. Testing the equality of two sample locations is also a very important problem (Srivastava and Du, 2008; Cai, Liu and Xia, 2014; Chen et al., 2011; Gregory et al., 2015). In the two sample problem, the common mean vector is not specified and needs to be estimated. How to extend our method deserves further study. Furthermore, the proposed test procedure is essentially developed under the framework of $L_2$-norm-based tests. In another direction, Cai, Liu and Xia (2014) and Zhong, Chen and Xu (2013) used the max-norm or thresholding approach to construct tests rather than the $L_2$-norm. Generally speaking, the max-norm test is for
















































































Table 2: Empirical sizes and power (%) comparison at 5% significance under Scenarios (I)-(VII)

               Size               Dense              Sparse
Scenario   CQ    SS    OS     CQ    SS    OS     CQ    SS    OS

(n,p) = (40,200)
(I)        5.8   6.3   6.2   74.9  76.6  76.0   81.0  83.5  82.8
(II)       4.5   5.7   6.2   32.4  68.2  75.3   33.6  72.9  78.7
(III)      5.1   5.9   5.7   43.1  68.9  75.2   46.3  77.4  82.3
(IV)       6.1   7.1   6.2    9.0  55.1  63.7   10.3  60.6  68.9
(V)        6.1   7.0   5.4   12.6  58.6  94.7   13.4  64.1  96.3
(VI)       6.6   7.3   5.4   25.1  29.7  29.5   27.4  34.0  34.3
(VII)      4.8   5.1   4.8   34.8  38.6  39.4   40.9  45.3  45.1

(n,p) = (40,400)
(I)        5.2   6.0   5.9   78.6  80.1  79.9   80.3  82.6  82.3
(II)       4.3   5.1   4.7   29.7  68.1  76.9   31.9  70.7  79.4
(III)      4.9   6.0   6.6   40.8  73.7  80.5   43.1  76.6  80.9
(IV)       5.4   6.5   5.3    8.3  54.5  65.3    8.5  59.0  68.3
(V)        4.7   6.9   5.1   10.6  57.9  95.2   10.6  59.9  94.6
(VI)       3.2   4.5   4.7   23.3  27.2  27.4   24.2  27.0  26.4
(VII)      6.0   7.0   5.8   34.8  39.9  39.7   38.4  41.4  41.9

(n,p) = (40,800)
(I)        4.2   5.8   5.4   80.7  82.4  81.5   78.4  80.5  80.1
(II)       5.3   5.1   5.4   31.7  69.1  77.5   31.3  72.1  79.7
(III)      5.2   5.2   5.7   43.9  74.3  80.2   44.5  74.2  81.7
(IV)       4.1   4.7   5.5    6.4  54.2  65.7    7.3  57.8  68.1
(V)        5.9   7.0   5.0   10.3  59.9  94.8    9.6  60.2  94.7
(VI)       4.3   5.1   5.3   21.3  25.5  26.4   21.7  25.8  26.7
(VII)      4.7   5.7   5.4   36.8  41.0  40.1   36.3  40.6  40.7

CQ, Chen and Qin (2010)'s test; SS, Wang et al. (2015)'s test; OS, our proposed high dimensional uniformly optimal sign test.
more sparse and stronger signals whereas the $L_2$-norm test is for denser but fainter signals. Fan, Liao and Yao (2015) also proposed a power-enhancement test based on a screening technique. Developing a spatial-sign-based test for sparse signals is of interest for future study.


Appendix A: Scalar-invariant test 


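Here we replace $\Sigma$ in $R_n$ with its diagonal matrix and define the following test statistic

$$\tilde T_n = \frac{2}{n(n-1)} \sum\sum_{i<j} K(\|\hat D_{ij}^{-1/2}X_i\|)\,K(\|\hat D_{ij}^{-1/2}X_j\|)\,U(\hat D_{ij}^{-1/2}X_i)^T U(\hat D_{ij}^{-1/2}X_j),$$

where $\hat D_{ij}$ is the corresponding diagonal matrix estimator using the leave-two-out sample $\{X_k\}_{k \neq i,j}$ (Feng et al., 2015a). Now, $\tilde T_n$ is invariant under scalar transformations $X_i \to BX_i$, $B = \mathrm{diag}\{b_1^2, \cdots, b_p^2\}$. Define $R = D^{-1/2}\Sigma D^{-1/2}$ where $D$ is the diagonal matrix of $\Sigma$. Now the conditions (C1)-(C3) become

(C1') $\mathrm{tr}(R^4) = o(\mathrm{tr}^2(R^2))$ and $\mathrm{tr}(R^2) - p = o(n^{-1}p^2)$.

(C2') $\tilde\nu_4 = O(\tilde\nu_2^2)$ where $\tilde\nu_l = E(K^l(\tilde r_i))$ and $\tilde r_i = \|D^{-1/2}X_i\|$.

(C3') $\theta^T D^{-1}\theta = O(\tilde c_0^{-2}\tilde\sigma_n)$, $\theta^T D^{-1/2} R D^{-1/2}\theta = o(np\,\tilde c_0^{-2}\tilde\sigma_n)$ where $\tilde c_0 = E\{K(\tilde r_i)\tilde r_i^{-1}\}$ and $\tilde\sigma_n^2 = 2n^{-2}p^{-2}\tilde\nu_2^2\,\mathrm{tr}(R^2)$.

Furthermore, we need another technical condition for the consistency of $\hat D_{ij}$.

(C4') $n^{-2}p^2/\mathrm{tr}(R^2) = O(1)$ and $\log(p) = o(n)$.

Theorem 3 Under Conditions (C1')-(C4'), as $n, p \to \infty$, we have

$$\frac{\tilde T_n - \tilde c_0^2\,\theta^T D^{-1}\theta}{\tilde\sigma_n} \xrightarrow{d} N(0,1).$$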
Correspondingly, the ratio-consistent estimator of $\tilde\sigma_n^2$ is

[...]

where $\hat\mu_{i,j} =$ [...]. So the asymptotic power function of $\tilde T_n$ is

$$\beta_{\tilde T_n}(\|\theta\|) = \Phi\left(-z_\alpha + \frac{[E\{K(\tilde r_i)\tilde r_i^{-1}\}]^2\, pn\,\theta^T D^{-1}\theta}{E\{K^2(\tilde r_i)\}\sqrt{2\,\mathrm{tr}(R^2)}}\right).$$

By the Cauchy inequality, the optimal weight function is also $K(t) = t^{-1}$.


Appendix B: Technical Details 


Define $U_i = U(X_i - \theta)$, $u_i = U(\varepsilon_i)$, $r_i^* = \|\varepsilon_i\|$ and [...] $= (b_{0ij})$.

First, we restate Lemma 4 in Zou et al. (2014).

Lemma 1 Suppose $u$ is uniformly distributed on the unit $p$-sphere. For any $p \times p$ symmetric matrix $M$, we have

$$E(u^T M u)^2 = \{\mathrm{tr}^2(M) + 2\,\mathrm{tr}(M^2)\}/(p^2 + 2p),$$

$$E(u^T M u)^4 = \{3\,\mathrm{tr}^2(M^2) + 6\,\mathrm{tr}(M^4)\}/\{p(p+2)(p+4)(p+6)\}.$$
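The second-moment identity in Lemma 1 is easy to check by simulation. The sketch below is only a sanity check under choices of my own (dimension $p = 5$, a randomly generated symmetric $M$, 200,000 draws); it compares a Monte Carlo estimate of $E(u^T M u)^2$ with $\{\mathrm{tr}^2(M) + 2\,\mathrm{tr}(M^2)\}/(p^2 + 2p)$. The fourth-moment expression is not checked here.

```python
import numpy as np

rng = np.random.default_rng(2)
p, reps = 5, 200_000

B = rng.standard_normal((p, p))
M = (B + B.T) / 2                     # an arbitrary symmetric matrix

# Normalized Gaussian vectors are uniform on the unit sphere
u = rng.standard_normal((reps, p))
u /= np.linalg.norm(u, axis=1, keepdims=True)

q = np.einsum("ij,jk,ik->i", u, M, u)  # u^T M u for every draw
mc = np.mean(q ** 2)
exact = (np.trace(M) ** 2 + 2 * np.trace(M @ M)) / (p ** 2 + 2 * p)

print(f"Monte Carlo: {mc:.4f}, formula: {exact:.4f}")
```

The two numbers should agree up to Monte Carlo error, which is of order $\text{reps}^{-1/2}$ in relative terms.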


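B1: Proof of Theorem 1

Obviously, $E(W_n) = 0$ and

$$\mathrm{var}(W_n) = \frac{2}{n(n-1)} E\{K^2(r_i)K^2(r_j)(U(X_i)^T U(X_j))^2\}.$$

Because $\|X_i\|^2 = \varepsilon_i^T\Sigma\varepsilon_i = \varepsilon_i^T\varepsilon_i + \varepsilon_i^T(\Sigma - I_p)\varepsilon_i$ and $E\{\varepsilon_i^T(\Sigma - I_p)\varepsilon_i\}^2 = O\big(E(\|\varepsilon_i\|^4)\,p^{-2}\{\mathrm{tr}(\Sigma^2) - p\}\big)$, Condition (C1) gives $\|X_i\| = \|\varepsilon_i\|(1 + o_p(1))$. Similarly, $U(X_i) = \Sigma^{1/2}u_i(1 + o_p(1))$. Thus,

$$\mathrm{var}(W_n) = 2n^{-2}E\{K^2(r_i^*)K^2(r_j^*)(u_i^T\Sigma u_j)^2\}(1 + o(1)) = 2n^{-2}p^{-2}\nu_2^2\,\mathrm{tr}(\Sigma^2)(1 + o(1)).$$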
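Thus, we only need to prove the asymptotic normality of $W_n$. Define $W_{nk} = \sum_{i=2}^{k} Z_{ni}$ where $Z_{ni} = \frac{2}{n(n-1)}\sum_{j=1}^{i-1} V_i^T V_j$ and $V_i = K(r_i)U_i$. Let $A = E(V_iV_i^T)$. Let $\mathcal{F}_{n,i} = \sigma\{V_1, \cdots, V_i\}$ be the $\sigma$-field generated by $\{V_j, j \le i\}$. Obviously, $E(Z_{ni} \mid \mathcal{F}_{n,i-1}) = 0$ and it follows that $\{W_{nk}, \mathcal{F}_{n,k}; 2 \le k \le n\}$ is a zero mean martingale. The central limit theorem (Hall and Heyde, 1980) will hold if we can show

$$\frac{\sum_{j=2}^{n} E(Z_{nj}^2 \mid \mathcal{F}_{n,j-1})}{\sigma_n^2} \xrightarrow{p} 1, \qquad (2)$$

and for any $\epsilon > 0$,

$$\sigma_n^{-2}\sum_{i=2}^{n} E\{Z_{ni}^2 I(|Z_{ni}| > \epsilon\sigma_n) \mid \mathcal{F}_{n,i-1}\} \xrightarrow{p} 0. \qquad (3)$$

It can be shown that

$$\sum_{j=2}^{n} E(Z_{nj}^2 \mid \mathcal{F}_{n,j-1}) = \frac{4}{n^2(n-1)^2}\sum_{j=2}^{n}\sum_{i=1}^{j-1} V_i^T A V_i + \frac{8}{n^2(n-1)^2}\sum_{j=2}^{n}\sum_{i_1<i_2}^{j-1} V_{i_1}^T A V_{i_2} = C_{n1} + C_{n2}.$$

Obviously, $E(C_{n1}) = \frac{2}{n(n-1)}\mathrm{tr}(A^2) = \sigma_n^2(1 + o(1))$ by the calculation of $\mathrm{var}(W_n)$, and $\mathrm{var}(C_{n1}) = O(n^{-5})\,\mathrm{var}(V_i^T A V_i)$. According to Lemma 1, we have $\mathrm{var}(V_i^T A V_i) = O(\mathrm{tr}^2(A^2) + \mathrm{tr}(A^4))$. Thus, by Condition (C1), we have $\mathrm{var}(C_{n1}) = O(n^{-5})\,\mathrm{tr}^2(A^2) = o(\sigma_n^4)$, so that $C_{n1}/\sigma_n^2 \xrightarrow{p} 1$. Similarly, $E(C_{n2}^2) = O(n^{-4})\,\mathrm{tr}(A^4) = o(\sigma_n^4)$. Then (2) holds. Next, to prove (3), by Chebyshev's inequality, we only need to show

$$E\left\{\sum_{j=2}^{n} E(Z_{nj}^4 \mid \mathcal{F}_{n,j-1})\right\} = o(\sigma_n^4).$$

Note that

$$E\left\{\sum_{j=2}^{n} E(Z_{nj}^4 \mid \mathcal{F}_{n,j-1})\right\} = \sum_{j=2}^{n} E(Z_{nj}^4) = O(n^{-8})\sum_{j=2}^{n} E\left(\sum_{i=1}^{j-1} V_j^T V_i\right)^4,$$

which can be decomposed as $3Q + P$ where

$$Q = O(n^{-8})\sum_{j=2}^{n}\sum_{s<t}^{j-1} E(V_j^T V_s V_s^T V_j V_j^T V_t V_t^T V_j),$$

$$P = O(n^{-8})\sum_{j=2}^{n}\sum_{i=1}^{j-1} E\{(V_j^T V_i)^4\}.$$

Obviously, $Q = O(n^{-5})E\{(V_j^T A V_j)^2\} = O(n^{-5})\,\mathrm{tr}^2(A^2)$ by Lemma 1 and Condition (C1). Then $Q = o(\sigma_n^4)$. Similarly, we can show that $P = O(n^{-6})\,\mathrm{tr}^2(A^2) = o(\sigma_n^4)$. This completes the proof. □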


B2: Proof of Theorem 2 


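By the Taylor expansion, we have

$$U(X_i) = U_i + r_i^{-1}(I_p - U_iU_i^T)\theta + o_p(n^{-1}).$$

Thus, taking the same procedure as in Theorem 1, we have

$$W_n = \frac{2}{n(n-1)}\sum\sum_{i<j} K(r_i)K(r_j)U_i^T U_j + \frac{4}{n(n-1)}\sum\sum_{i<j} K(r_i)K(r_j)r_j^{-1}U_i^T\theta + \frac{2}{n(n-1)}\sum\sum_{i<j} K(r_i)K(r_j)r_i^{-1}r_j^{-1}\theta^T\theta + o_p(\sigma_n).$$

And

$$E\left[\left\{\frac{2}{n(n-1)}\sum\sum_{i<j} K(r_i)K(r_j)r_j^{-1}U_i^T\theta\right\}^2\right] = O(n^{-1}p^{-1}c_0^2\,\theta^T\Sigma\theta) = o(\sigma_n^2)$$

by Condition (C3). Similarly,

$$\frac{2}{n(n-1)}\sum\sum_{i<j} K(r_i)K(r_j)r_i^{-1}r_j^{-1}\theta^T\theta = c_0^2\,\theta^T\theta + o_p(\sigma_n).$$

Then,

$$W_n = \frac{2}{n(n-1)}\sum\sum_{i<j} K(r_i)K(r_j)U_i^T U_j + c_0^2\,\theta^T\theta + o_p(\sigma_n).$$

According to Theorem 1, we can easily obtain the result. □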
B3: Consistency of $\hat\sigma_n^2$

Taking the same procedure as the proof of Theorem 2 in Chen and Qin (2010), we have

$$\hat\sigma_n^2 = 2n^{-4}\sum\sum_{i \neq j}(V_i^T V_j)^2 + o_p(\sigma_n^2),$$

by Condition (C3). According to the proof of Theorem 1, we have $E\{(V_i^T V_j)^2\} = \mathrm{tr}(A^2) = p^{-2}\nu_2^2\,\mathrm{tr}(\Sigma^2)(1 + o(1))$. So $E(\hat\sigma_n^2) = \sigma_n^2(1 + o(1))$. And $\mathrm{var}((V_i^T V_j)^2) = o(\mathrm{tr}^2(A^2))$ by Conditions (C1) and (C2). Thus, $\mathrm{var}(\hat\sigma_n^2) = o(\sigma_n^4)$. So $\hat\sigma_n^2/\sigma_n^2 \xrightarrow{p} 1$. □

B4: Proof of Theorem 3 


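By the Taylor expansion,

$$U(\hat D_{ij}^{-1/2}X_i) = U_i - (I_p - U_iU_i^T)(\hat D_{ij}^{-1/2} - D^{-1/2})U_i + \tilde r_i^{-1}(I_p - U_iU_i^T)D^{-1/2}\theta + o_p(n^{-1}).$$

Taking the same procedure as the proof of Theorem 1 in Feng and Sun (2015), by Conditions (C1'), (C2') and (C4'), we have

$$\tilde T_n = \frac{2}{n(n-1)}\sum\sum_{i<j} K(\tilde r_i)K(\tilde r_j)U_i^T U_j$$

$$+ \frac{2}{n(n-1)}\sum\sum_{i<j} K(\tilde r_i)K(\tilde r_j)\tilde r_i^{-1}U_j^T(I_p - U_iU_i^T)D^{-1/2}\theta$$

$$+ \frac{2}{n(n-1)}\sum\sum_{i<j} K(\tilde r_i)K(\tilde r_j)\tilde r_j^{-1}U_i^T(I_p - U_jU_j^T)D^{-1/2}\theta$$

$$+ \frac{2}{n(n-1)}\sum\sum_{i<j} K(\tilde r_i)K(\tilde r_j)\tilde r_i^{-1}\tilde r_j^{-1}\theta^T D^{-1/2}(I_p - U_iU_i^T)(I_p - U_jU_j^T)D^{-1/2}\theta + o_p(n^{-2})$$

$$= T_{n1} + T_{n2} + T_{n3} + T_{n4}.$$

By the same arguments as in the proof of Theorem 1, we have

$$T_{n1}/\tilde\sigma_n \xrightarrow{d} N(0,1),$$

and

$$T_{n2} + T_{n3} + T_{n4} = \tilde c_0^2\,\theta^T D^{-1}\theta + o_p(\tilde\sigma_n).$$

Thus, by Condition (C3'), $(\tilde T_n - \tilde c_0^2\,\theta^T D^{-1}\theta)/\tilde\sigma_n \xrightarrow{d} N(0,1)$. □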